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(57) Abstract 

A method for producing a mature protein in yeast transformed to
express a corresponding precursor, wherein the mature protein
sequence is contained in the precursor and is flanked proximally or
both proximally and distally by a pair or triplet of basic amino
acid residues. The method comprises proteolytic processing by an
endopeptidase and exopeptidase present in the yeast. Yeast
transformed by a plasmid containing a cDNA sequence encoding bovine
preproparathyroid hormone is also disclosed.

PRODUCTION OF MATURE PROTEINS
IN TRANSFORMED YEAST

BACKGROUND OF THE INVENTION 
1. Field of the Invention. 
This invention relates to a method for producing a mature protein
in transformed yeast and further relates to Saccharomyces cerevisiae
transformed by a plasmid containing a preproparathyroid hormone
cDNA insert.

2. Description of the Prior Art.

Recombinant DNA technology now makes it possible to isolate
specific genes or portions thereof from higher organisms, such as
man and other animals, and to transfer the genes or fragments to a
microorganism species, such as E. coli or yeast. The transferred
gene is replicated and propagated as the transformed microorganism
replicates, and the microorganism may become endowed with the
capacity to make whatever protein the gene or fragment encodes,
whether it be an enzyme, a hormone, an antigen or an antibody, or a
portion thereof. The microorganism passes on this capability to its
progeny, so that in effect, the transfer results in a new strain
having the described capability.
Recombinant DNA conventionally utilizes transfer vectors. A
transfer vector is a DNA molecule which contains genetic
information which ensures its own replication when transferred to a
host microorganism strain. Plasmids are an example of a transfer
vector commonly used in genetics. Although plasmids have been used
as the transfer vectors for the work described herein, it will be
understood that other types of transfer vectors may be employed.
Plasmid is the term applied to any autonomously replicating DNA
unit which might be found in a microbial cell, other than the
genome of the host cell itself. A plasmid is not usually
genetically linked to the chromosome of the host cell. Plasmid DNA
exists as double stranded ring structures generally on the order of
a few million daltons molecular weight, although some are greater
than 10^8 daltons in molecular weight. They usually represent only
a small percent of the total DNA of the cell. Transfer vector DNA
is usually separable from host cell DNA by virtue of the great
difference in size between them. Transfer vectors carry genetic
information enabling them to replicate within the host cell.

Plasmid DNA exists as a closed ring. However, by appropriate
techniques, the ring may be opened, a fragment of heterologous DNA
inserted, and the ring reclosed, forming an enlarged molecule
containing the inserted DNA segment.

Transfer is accomplished by a process known as transformation.
During transformation, host cells mixed with plasmid DNA
incorporate entire plasmid molecules into the cells. Once a cell
has incorporated a plasmid, the latter is replicated within the
cell and the plasmid replicas are distributed to the progeny cells
when the cell divides.

Genetic information contained in the nucleotide sequence of the
plasmid DNA, including heterologous DNA inserted into the plasmid,
can in principle be expressed in the host cell. The inserted
heterologous DNA, typically representing a single gene, is
expressed when the protein product coded by the gene is synthesized
by the organism.

Once a gene has been isolated, purified and inserted into a plasmid
or other vector, the availability of the gene in substantial
quantity is assured. After transfer of the vector into a suitable
microorganism, the gene replicates as the microorganism
proliferates. The vector containing the gene is easily purified
from cultures of the host microorganism by known techniques, and
the gene is separable from the vector by restriction endonuclease
cleavage followed by gel electrophoresis. The protein product
expressed by the heterologous gene can also be recovered in
substantial quantities from cultures of the host microorganism by
harvesting the culture and retrieving the protein product contained
in the harvested cells. (For further detail of recombinant DNA
technology, and an explicit exposition of the utility of producing
proteins such as hormones, etc., by recombinant DNA technology, see
U.S. Patent No. 4,237,224, issued December 2, 1980 to Cohen et al.,
and U.S. Patent No. 4,322,499, issued March 30, 1982 to Baxter
et al. Patents and articles cited herein are incorporated by
reference wherever such citations occur and shall be considered
incorporated in their entirety as if set forth in full.)

Recombinant DNA thus holds great promise for economically producing
substantial quantities of useful proteins that are difficult or
costly to isolate in such quantities from mammalian tissue.

A major and nearly universal problem in producing useful proteins,
however, is the construction of the actual genetic material to be
inserted into the transfer vector.

Conventional means provide for enzymatically preparing desired
genetic material by reverse transcription. Mature messenger RNA
(mRNA), which is chemically similar to DNA and retains most of the
information coded in DNA, can be extracted from tissue in which the
desired gene is active. mRNA is separated from other RNA material
in the tissue, and complementary DNA (cDNA) is produced by the
enzyme reverse transcriptase, with DNA polymerase I at times used
for the synthesis of the second strand. This cDNA, a complementary
copy of mRNA and similarly containing the information coded in RNA,
is often further altered in known ways to be suitable for insertion
into a plasmid vector. (See W. Mahoney & S. Henikoff, Univ. of
Washington Medicine, Vol. 8, No. 4, pp. 6-14 (Winter, 1981)).
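The complementarity between mRNA and cDNA described above can be
illustrated with a minimal Python sketch. It models only base
pairing, not the enzymology. The function names and the toy message
are hypothetical and do not come from the cited work.

```python
# Base copied into DNA opposite each RNA template base (Watson-Crick pairing).
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
# Complement of each DNA base, used to model second-strand synthesis.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna):
    """Return the first-strand cDNA, written 5'->3', copied from an mRNA (5'->3')."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna.upper()))


def second_strand(dna):
    """Return the strand complementary to a DNA strand, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna.upper()))


mrna = "AUGAAACGCUAA"              # hypothetical toy message, not a real gene
first = first_strand_cdna(mrna)    # antisense strand made by reverse transcriptase
second = second_strand(first)      # sense strand, same as the mRNA with T for U
print(first, second)
```

The second strand carries the same information as the mRNA, with
thymine in place of uracil, which is why the double-stranded cDNA can
stand in for the gene in a plasmid vector.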

cDNA enzymatically prepared by reverse transcription has the
potential to express a protein chain identical to the protein
expressed by tissue from which the mRNA was extracted. This alone
is not sufficient, however, for the expression of desired mature
animal proteins because many animal proteins, represented by such
diverse classes as hormones, binding proteins, enzymes, antibodies,
and collagen, are produced in nature in the form of larger
precursors that are subsequently modified by cleavage to smaller
bioactive forms commonly designated mature proteins. Thus,
expression of cDNA synthesized by reverse transcription only has
the potential to express the precursor of the mature protein
product.

It has been known for several years that bacteria such as E. coli
can remove the "pre" portion of their own secreted proteins.
Examples include the processing of pre-ribose binding protein,
pre-galactose binding protein and pre-arabinose binding protein.
(L. Randall, et al., Eur. J. Biochem., Vol. 92, pp. 411-415 (1978);
L. Randall, S. Hardy, and L. Josefsson, Proc. Natl. Acad. Sci. USA,
Vol. 75, pp. 1209-1212 (1978)).

S. Chan, et al., Proc. Natl. Acad. Sci. USA, Vol. 78, pp. 5401-5405
(1981), have exploited the ability of E. coli to remove the "pre"
sequence. Chan, et al., modified cDNA for human preproinsulin to
encode a hybrid "pre" sequence containing portions of E. coli and
mammalian "pre" sequence. E. coli expressed the hybrid protein and
correctly removed the "pre" sequence by intracellular processing.
Thus, Chan, et al., were able to modify human preproinsulin cDNA in
a way that would allow E. coli to produce proinsulin.

It is also known that yeast shares the ability to remove "pre"
sequences from its own pre-proteins. Furthermore, when an E. coli
preprotein was genetically engineered into yeast, pre-β-lactamase
was processed to β-lactamase. (Roggenkamp, et al., Proc. Natl.
Acad. Sci. USA, Vol. 78, No. 7, pp. 4466-4470 (1981)).

The above type of processing of preproteins, however, will not
process to mature proteins many of the mammalian hormone precursors
and many of the other interesting mammalian protein precursors in
E. coli. These latter hormone and protein precursors contain a
"pro" portion which is not processed by the enzymatic mechanism
responsible for processing the "pre" portion of preproteins. As
shown above, for example, the natural precursor for insulin, i.e.
preproinsulin, is processed in E. coli to form proinsulin.
WO 84/01173 PCT/US83/01361

Many investigators have been unable to express pre-proteins in
yeast or E. coli, let alone get processing. Expensive and
time-consuming investigative efforts have focused almost
exclusively on genetically eliminating the "pre" sequences and the
"pro" sequences in attempting to express mature proteins without
intermediates.

In several prior art approaches, the need for processing precursor
proteins has been overcome. Insulin is the result of natural
processing in human tissue involving cleaving two peptide chains, A
and B, from the single large precursor preproinsulin and assembling
the A and B chains to form the mature hormone insulin. The A and B
chains are located within proinsulin and hence E. coli, which
processes preproinsulin to proinsulin, does not produce the mature
hormone insulin. An approach to obtaining mature insulin using
E. coli employs chemically synthesized genes compatible with
E. coli.
A double-stranded synthetic DNA coding sequence for the insulin A
chain was synthesized chemically from fundamental nucleotide units
to yield the correct coding sequence. An extra amino acid
(methionine) was added at one end. This end was fused to the
bacterial gene for the enzyme β-galactosidase, which results in
accumulation of fused β-galactosidase-insulin-A-chain protein.
This same procedure was repeated for the B-chain, which resulted in
the production of fused β-galactosidase-insulin-B-chain protein.

The fused proteins are insoluble in water and readily isolated from
broken cells. The A and B chains of insulin are released from
β-galactosidase at the extra methionine by cyanogen bromide
cleavage and subsequently mixed together under conditions that
allow formation of disulfide bonds between A and B chains, yielding
mature insulin. (W. Miller, J. of Pediatrics, Vol. 99, pp. 1-15
(1981); D. Goeddel, et al., Proc. Natl. Acad. Sci. USA, Vol. 76,
pp. 106-110 (1979)).
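The release of a chain at the extra methionine can be sketched in
Python as follows. Cyanogen bromide cleaves a peptide chain on the
carboxyl side of methionine residues. The sketch models that rule
only. The sequences are hypothetical toy strings in one-letter code,
not the real β-galactosidase or insulin sequences, and the function
name is an assumption made for illustration.

```python
def cnbr_fragments(protein):
    """Split a one-letter protein sequence after every methionine (M).

    This models cyanogen bromide cleavage in a simplified way. The M is
    kept on the upstream fragment; the chemical conversion of that
    residue during cleavage is not modeled.
    """
    fragments, current = [], ""
    for residue in protein:
        current += residue
        if residue == "M":
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


carrier = "TLTDSLAV"   # hypothetical stand-in for the carrier protein (no M)
chain = "GIVEQ"        # hypothetical stand-in for a chain with no internal M
fusion = carrier + "M" + chain

print(cnbr_fragments(fusion))      # the chain is released intact
print(cnbr_fragments("MFPTMSLR"))  # a product with an internal M is fragmented
```

The second call shows the limit of the technique: the released product
stays intact only if it contains no methionine of its own.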
The above prior art approach overcomes the need for processing a
precursor protein, but in turn requires processing of the fused
β-galactosidase-insulin-A-chain and β-galactosidase-insulin-B-chain
proteins to mature insulin. Moreover, chemical synthesis of the DNA
coding sequences for A-chain and B-chain involves substantial
costs, even when considering that the
β-galactosidase-insulin-A-chain gene and
β-galactosidase-insulin-B-chain gene after being synthesized are
easily replicated for subsequent production of insulin.
(D. Williams, et al., Science, Vol. 215, pp. 687-689 (Feb. 1982);
W. Mahoney, Univ. of Wash. Medicine, supra).

The approach of chemically synthesizing DNA encoding mature
proteins has also been shown to be effective for bacterial
production of human somatostatin. (K. Itakura, et al., Science,
Vol. 198, pp. 1056-1063 (1977)). However, insulin chains A and B
and human somatostatin are relatively small sequences, and the
chemically synthesized DNA sequences coding for them are relatively
small. In the case of larger proteins, chemical synthesis of the
DNA coding sequence for such proteins is prohibitively time
consuming.
One prior art approach, now often followed, utilizes chemically
synthesized DNA in conjunction with enzymatically prepared cDNA to
produce a gene which instructs production of mature hormone in
bacteria. Human growth hormone (HGH) is a protein of 191 amino
acids, its precursor having an additional 26 amino acid "pre"
portion. cDNA encoding the precursor was enzymatically prepared
from mRNA isolated from human pituitary tissue. The first useful
cleavage site of the cDNA occurs at the site encoding amino acid
residues 23-24 of HGH. Treatment of the cDNA with restriction
endonuclease Hae III gives a DNA fragment of 551 base pairs which
includes coding sequences for amino acids 24-191 of HGH. A gene
fragment having coding sequences for residues 1-23 of HGH (and an
initiation codon) was chemically synthesized. The two DNA fragments
were combined to form a synthetic-natural hybrid gene which when
inserted into a plasmid vector directed expression of mature HGH in
E. coli. (D. Goeddel, et al., Nature, Vol. 281, pp. 544-548
(October 1979)).
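The fragment sizes quoted above can be checked with simple codon
arithmetic. The short Python sketch below assumes three base pairs
per codon and uses only the residue numbers and the 551 base pair
figure given in the text. The variable names are hypothetical. The
text does not say what the base pairs beyond the coding region
contain, so the sketch reports only their number.

```python
BP_PER_CODON = 3


def coding_bp(first_residue, last_residue):
    """Base pairs needed to encode residues first..last, inclusive."""
    return BP_PER_CODON * (last_residue - first_residue + 1)


HAE_III_FRAGMENT_BP = 551                    # size quoted in the text

natural_coding_bp = coding_bp(24, 191)       # residues carried on the cDNA fragment
remaining_bp = HAE_III_FRAGMENT_BP - natural_coding_bp

# Synthetic fragment: residues 1-23 plus one initiation codon.
# Any extra bases needed for joining the fragments are not counted here.
synthetic_coding_bp = coding_bp(1, 23) + BP_PER_CODON

print(natural_coding_bp, remaining_bp, synthetic_coding_bp)
```

On these figures, chemical synthesis is needed for only a small part
of the gene, while the cDNA supplies the bulk of the coding sequence.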

Using a similar strategy of cleavage and reconstruction of DNA for
the mature protein, R. Lawn et al., Nucleic Acids Research, Vol. 9,
No. 22, pp. 6103-6114 (1981), expressed mature human albumin in
E. coli.

This general approach, however, requires time consuming chemical
synthesis of desired gene fragments, cleavage of cDNA assuming the
availability of useful cleavage sites, and difficult genetic
construction of plasmids from DNA fragments. Furthermore, in both
of the above examples, an initiator methionine was left at the
NH2-terminal. The initiator methionines cannot practically be
removed since HGH and albumin also have methionines located
elsewhere in the sequence. Thus, removing the initiator methionine
by cyanogen bromide cleavage would result in cleavage at the other
methionines. This would result in a protein split into cleaved
fragments. Both the HGH and albumin produced by the above approach
are "mature" proteins which start with methionines. Hence they are
not "real" mature proteins.

The prior art approaches set forth above illustrate that a major
and nearly universal problem in producing mature proteins is the
construction of the actual genetic material to be inserted into
transfer vectors. Procedures exist for preparing cDNA from mRNA
isolated from mammalian or other higher order animal tissue, but
mammalian and higher order animal proteins are most often expressed
as precursors and subsequently processed into the mature protein in
cells of origin. The prior art has identified E. coli and yeast as
microorganisms capable of processing precursors containing the
"pre" portion, but this class of precursors excludes many of the
precursors of interest. The prior art thus has not identified a
microorganism suitable for cloning mammalian and higher order
animal genes which is capable of processing to mature proteins the
precursors of greatest interest. The prior art approaches attempt
to solve the problem by constructing genes that code for mature
protein. However, although procedures now exist for identifying
nucleotide coding sequences for mature proteins, chemical synthesis
of DNA sequences encoding mature proteins or fragments thereof for
use in hybrid genes is costly and time consuming, often
prohibitively so.

BRIEF DESCRIPTION OF THE FIGURES
FIG. 1 shows inferred protein cleavage sites within the precursor
of yeast α-factor, where "K" designates lysine and "R" designates
arginine amino acid residues.

FIG. 2 shows the cDNA sequence encoding preproparathyroid hormone
and the unique Pvu II and Hinf I cleavage sites.

FIG. 3 shows certain portions of the nucleotide sequence of the
pYEM-1 plasmid.

SUMMARY OF THE INVENTION
In the present invention, a method is disclosed for producing
protein in yeast transformed to express a corresponding precursor
containing a pair or triplet of basic amino acid residues located
proximally and/or distally adjacent to the protein portion of the
precursor sequence, comprising proteolytic processing by the yeast
of the precursor at the site of such pairs or triplets of basic
amino acid residues. The method comprises proteolytic processing by
transformed yeast which contains an endopeptidase, designated
herein as a trypsin-like enzyme or enzymes. The trypsin-like enzyme
or enzymes proteolytically process the precursor at the site of
such pairs or triplets of basic amino acid residues by cleaving at
the distal side of such pairs or triplets. The method further
comprises proteolytic processing by transformed yeast that contains
an exopeptidase, designated herein as a carboxypeptidase-B-like
enzyme or enzymes. The carboxypeptidase-B-like enzyme or enzymes
proteolytically process the precursor at the site of such pairs or
triplets of basic amino acid residues by degrading such pairs or
triplets of basic amino acid residues remaining distally adjacent
to the protein portion of the precursor sequence after the cleavage
by the trypsin-like enzyme or enzymes.

In the present invention, the above method is further disclosed for
proteolytic processing of proto-proteins to mature proteins.
Proto-proteins, defined with greater specificity infra, consist
generally of precursor proteins in which the protein portion of the
precursor sequence is identical in structure to the mature protein
except for the absence of the amino terminal and the carboxyl
terminal in the precursor sequence. The above method is also
disclosed for proteolytic processing of certain non-proto-proteins.
For example, the above method is disclosed for proteolytic
processing of preproinsulin or proinsulin to mature insulin. The
above method is disclosed for producing mammalian insulin generally
as well as human, bovine, and porcine insulin specifically.
According to the method, preprocalcitonin and procalcitonin may be
proteolytically processed by transformed yeast to form mature
calcitonin or a calcitonin relative in the case of animal
calcitonin generally and human, bovine, and porcine calcitonin
specifically.

In the present invention, a recombinant DNA plasmid transfer vector
useful for transforming yeast, comprising a DNA sequence comprising
the preproparathyroid gene cDNA sequence, is disclosed, as well as
the plasmid pYEM-1, yeast transformed by a plasmid comprising the
above transfer vector, and yeast transformed by the plasmid pYEM-1.
DESCRIPTION OF THE SPECIFIC EMBODIMENT

Proto-proteins may consist of precursors for which DNA and mRNA
encoding the precursors naturally occur in animals. This type of
proto-protein is designated a source natural proto-protein.
Proto-proteins may also consist of precursors in which synthetic
DNA encodes the precursor. This type of proto-protein is designated
a source synthetic proto-protein. For example, by chemical
synthesis, or alternatively by enzymatic cleavage, rearrangement
and subsequent fusion, DNA can be synthesized so that the precursor
which it encodes has the cleavage properties discussed below.
Production of mature protein might be enhanced by transforming
yeast with synthetic DNA encoding a precursor having repetitive
sequences of the mature protein, each sequence being flanked by
appropriate cleavage sites.
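The idea of a repetitive source synthetic precursor can be sketched
in Python as follows. This is a toy model under stated assumptions:
sequences are hypothetical one-letter strings, a cleavage site is
taken to be a run of two or three lysine (K) or arginine (R)
residues, and processing is modeled simply as removal of those runs.
The function names are illustrative and do not describe an actual
construct.

```python
import re

# A run of two or three basic residues (K = lysine, R = arginine).
BASIC_SITE = re.compile(r"[KR]{2,3}")


def build_repetitive_precursor(leader, mature, copies, site="KR"):
    """Join a leader and several copies of a mature sequence, each copy
    preceded by a basic cleavage site. The last copy ends at the
    carboxyl terminal of the precursor."""
    return leader + "".join(site + mature for _ in range(copies))


def released_peptides(precursor):
    """Return the peptides left after the basic sites are removed."""
    return [piece for piece in BASIC_SITE.split(precursor) if piece]


leader = "MASLV"    # hypothetical leader sequence
mature = "GSTAEL"   # hypothetical mature sequence with no adjacent K/R
precursor = build_repetitive_precursor(leader, mature, copies=3)

peptides = released_peptides(precursor)
mature_copies = peptides.count(mature)
print(precursor, peptides, mature_copies)
```

In this model one precursor molecule yields as many copies of the
mature sequence as were encoded, which is the reason such a
construct might enhance production.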

Source natural proto-proteins are illustrated by, but not limited
to, certain hormone precursors, including preproparathyroid
(J. Habener & J. Potts, The New England Journal of Medicine (Second
Part), Vol. 299, No. 12, pp. 635-643 (Sept. 1978)),
preprosomatostatin (P. Hobart, et al., Nature, Vol. 288,
pp. 137-139 (November 1980)), AVP-NpII precursor to arginine
vasopressin and its corresponding neurophysin (H. Land, et al.,
Nature, Vol. 295, pp. 299-303 (January 1982)),
corticotropin-β-lipotropin precursor to corticotropin (ACTH) and
β-lipotropin (β-LPH) (S. Nakanishi, et al., Nature, Vol. 278,
pp. 423-427 (March 1979)), preproglucagon (P. Lund, et al., Proc.
Natl. Acad. Sci. USA, Vol. 79, pp. 345-349 (January 1982)), and
pro-opiomelanocortin (POMC) precursor to β-endorphin and Met- and
Leu-enkephalin precursor (M. Comb, et al., Nature, Vol. 295,
pp. 663-666 (February 1982)).

Source natural proto-proteins are also illustrated by melittin
precursor (G. Suchanek, et al., Eur. J. Biochemistry, Vol. 60,
pp. 309-315 (1975); G. Suchanek, et al., Proc. Natl. Acad. Sci.
USA, Vol. 75, pp. 701-704 (1978)) and serum albumin precursors
(R. Lawn, et al., Nucleic Acids Research, Vol. 9, No. 22,
pp. 6103-6114 (1981)).

As reported in the above citations, these precursors contain within
their sequence at least one mature protein sequence. Where there is
a single mature protein sequence contained in the precursor, it is
flanked proximally by a pair or triplet of basic amino acid
residues consisting of lysine and/or arginine and is flanked
distally by either the carboxyl-terminal of the precursor or a pair
or triplet of basic amino acid residues lysine and/or arginine. If
there are several mature protein sequences contained in the
precursor, at least one of the mature protein sequences is flanked
proximally by a pair or triplet of such basic amino acid residues
and is flanked distally by either the carboxyl-terminal of the
precursor or a pair or triplet of such basic amino acid residues.
Any precursor protein falling within this description is defined
herein as a proto-protein, whether it be source natural or source
synthetic.
As reported in the above citations in connection with observing the
production of mature proteins in mammals and other higher order
animals, the cleavage site located on the distal side of a pair or
triplet of such basic amino acid residues is readily attacked by
endopeptidases with trypsin-like activity. After endopeptidase
cleavage, any residual basic residues remaining adjacent to and on
the distal side of the mature protein are susceptible to degrading,
i.e. selective removal, by exopeptidases with activity resembling
that of carboxypeptidase-B.

Thus, for example, in preproparathyroid hormone the mature protein
is flanked proximally by the basic triplet lysine-lysine-arginine
and is flanked distally by the carboxyl-terminal of the precursor.
A single cleavage by a trypsin-like enzyme is sufficient to produce
the mature hormone. In other proteins such as the glucagon
precursor, two mature glucagon proteins are flanked both proximally
and distally by a basic pair lysine-arginine. Combined cleavage by
a trypsin-like enzyme and degradation of the resulting
carboxyl-terminal by a carboxypeptidase-B-like enzyme are required
to produce the mature proteins.
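The two processing steps described above can be sketched in Python
as a toy model. It assumes that a cleavage site is any run of two or
three lysine (K) or arginine (R) residues, that the trypsin-like
step cuts on the distal (carboxyl) side of each run, and that the
carboxypeptidase-B-like step removes basic residues left at the
carboxyl end of a piece. The sequences are hypothetical and only
mimic the flanking patterns of the two examples; the function names
are illustrative.

```python
import re

BASIC = "KR"                              # lysine and arginine, one-letter code
BASIC_RUN = re.compile(r"[KR]{2,3}")      # a pair or triplet of basic residues


def trypsin_like_cleave(precursor):
    """Cut on the distal side of every pair or triplet of basic residues."""
    pieces, start = [], 0
    for match in BASIC_RUN.finditer(precursor):
        pieces.append(precursor[start:match.end()])
        start = match.end()
    if start < len(precursor):
        pieces.append(precursor[start:])
    return pieces


def carboxypeptidase_b_like_trim(peptide):
    """Remove basic residues left at the carboxyl end of a peptide.

    Toy rule: it would also remove a basic residue that belongs at the
    end of a real mature protein, so it suits these examples only.
    """
    return peptide.rstrip(BASIC)


# Pattern 1: proximal triplet, carboxyl terminal of the precursor distally.
pth_like = "MLVAS" + "KKR" + "AVSEIQ"
pth_pieces = trypsin_like_cleave(pth_like)

# Pattern 2: a basic pair on both sides of the sequence of interest.
glucagon_like = "MLVAS" + "KR" + "HSQGTF" + "KR" + "NNIA"
cut = trypsin_like_cleave(glucagon_like)
trimmed = [carboxypeptidase_b_like_trim(piece) for piece in cut]

print(pth_pieces, cut, trimmed)
```

In the first pattern the last piece is already free of basic
residues, so one cleavage suffices. In the second pattern the middle
piece still carries the distal pair after cleavage and needs the
trimming step.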



The method of the present invention comprises proteolytic
processing by yeast of proto-proteins to mature proteins. In the
method, transformed yeast naturally containing a trypsin-like
enzyme or enzymes and a carboxypeptidase-B-like enzyme or enzymes
proteolytically release mature proteins from larger precursors.
These enzymes will effectively cleave and degrade proto-proteins to
mature proteins. This is confirmed by a trypsin-like cleavage,
discussed infra, of preproparathyroid hormone yielding mature
parathyroid hormone. This is further confirmed by yeast processing
its own mating factor, α-factor. (T. Tanaka, et al.,
J. Biochemistry, Vol. 82, pp. 1681-1687 (1977)). As shown in
FIG. 1, the nucleotide sequence of α-factor shows that yeast
naturally expresses a precursor containing four distinct codings
for mature α-factor. Three of the four α-factors in the precursor
are flanked distally by a pair of basic amino acid residues. A
trypsin-like cleavage in combination with a carboxypeptidase-B-like
degrading naturally yields correctly processed C-termini for these
three α-factors. After a trypsin-like cleavage, N-termini of the
four α-factors are flanked proximally by a series of several
glutamic acid and alanine amino acid residues. These latter
residues are in turn removed by an aminopeptidase. The foregoing
natural endopeptidase and exopeptidase activity in yeast, in
combination with the virtually uniform presence of pairs and
triplets of lysine and/or arginine flanking mature hormone
sequences in proto-proteins, underlies the present invention.

Although preproinsulin and proinsulin containing disulfide bonds
are not proto-proteins as defined herein, they will nevertheless
undergo proteolytic processing in yeast transformed to express the
preproinsulin or proinsulin. A pair or triplet of basic amino acid
residues are located distally and/or proximally adjacent to the
insulin-A-chain and the insulin-B-chain portions of the sequence
which constitute the protein portion of the precursor preproinsulin
and proinsulin sequences. The requisite disulfide bonds between the
insulin-A-chain portion of the sequence and the insulin-B-chain
portion of the sequence will be formed in yeast. (cf. the numerous
examples of disulfide bond formation in yeast disclosed in
M. Dayhoff, Atlas of Protein Sequence and Structure, Vol. 5 and
Supplements 1, 2 & 3 (National Biomedical Research Foundation,
Georgetown University Medical Center, Washington, D.C. 20007 (1972,
1973, 1976, and 1981))). Proteolytic processing at the site of such
pairs or triplets of basic amino acid residues will yield mature
insulin from preproinsulin or proinsulin containing the disulfide
bonds.

In the absence of disulfide bond formation between the
insulin-A-chain portion of the sequence and the insulin-B-chain
portion of the sequence, proteolytic processing will yield
insulin-A-chain and insulin-B-chain, which may be caused in turn to
attach to one another by disulfide bonds by conventional means to
form mature insulin. In this case, the insulin-A-chain and
insulin-B-chain may be considered mature proteins, and
preproinsulin and proinsulin without disulfide bonds may be
considered proto-proteins according to the above discussion of
proto-proteins.

Mature calcitonin contains disulfide bonds between the cysteines
located at positions 1 and 7 of the sequence, contains a
carbohydrate attached to the sequence at position 3, and the
proline at position 32 has been amidated to pro-amide while the
glycine at position 33 has been removed. Preprocalcitonin and
procalcitonin will contain the requisite disulfide bonds. (cf. the
numerous examples of disulfide bond formation in yeast as disclosed
in Dayhoff, supra). A carbohydrate will be attached at position 3
in calcitonin. Preprocalcitonin and procalcitonin will undergo
proteolytic processing in yeast transformed to express the
preprocalcitonin or procalcitonin. A pair of basic amino acid
residues are located proximally adjacent to the 33 amino acid
sequence, while a triplet is located distally adjacent to the 33
amino acid sequence. It is expected that amidation of the proline
located at position 32 will occur in yeast after the cleavage
distal to and degradation of the triplet. (cf. numerous examples of
amidation in yeast as disclosed by Dayhoff, supra). In the event
that a carbohydrate differing from the carbohydrate of mature
calcitonin is formed by the yeast, the calcitonin relative
containing the differing carbohydrate may be converted to mature
calcitonin by conventional means. In the event that amidation
following cleavage and degradation is suppressed, the calcitonin
relative lacking the amidation may also be converted to mature
calcitonin by conventional means.

By reverse transcription, cDNA can be prepared encoding any
proto-protein of interest by isolating mRNA from tissues expressing
the protein. Although many hormone and other protein genes have
already been cloned in E. coli, yeast has heretofore not been the
host of choice. cDNA not previously cloned in yeast can be rendered
compatible with a yeast host by proper codon selection
(J. Bennetzen & B. Hall, J. Biol. Chem., Vol. 257, pp. 3026 (1982))
and by site specific mutagenesis of the cDNA (G. Simmons, et al.,
Nucleic Acids Research, Vol. 10, pp. 821 (1982)).

Thus, one of the fundamental problems with producing useful mature
proteins by recombinant DNA techniques has been simplified in the
case of mature proteins derived from proto-proteins. cDNA, although
readily available for most proteins by reverse transcription of
mRNA isolated from animal tissue, will express the precursor of the
mature protein. Yeast, but not E. coli, has the requisite enzymes
to process expressed proto-proteins, preproinsulin, or proinsulin
to mature protein or insulin.

EXPERIMENTAL

In order to demonstrate the present invention, the following
experiment was carried out.

The plasmid YEp-13 was obtained from Dr. Steven Henikoff, Fred
Hutchinson, Dept. of Developmental Biology, Seattle, Washington,
and can be constructed according to J. Broach, et al., Gene,
Vol. 8, pp. 121-133 (1979). The gene which encodes yeast alcohol
dehydrogenase I was modified according to Hitzeman, et al., Nature
(London), Vol. 293, pp. 717-722 (1981), allowing the isolation of
the transcription signals. These sequences, including the cloning
site, were provided by Dr. G. Aromera. The plasmid YEp-13 was
modified so that the tet R gene of YEp-13 was interrupted at the
Bam HI site with the yeast alcohol dehydrogenase I gene promoter
and RNA polymerase stop sequences. A Hind III site between the
latter two elements provided the cloning site. These modifications
of plasmid YEp-13 were accomplished by methods set forth generally
in U.S. Patent 4,237,224, supra, and Methods of Enzymology,
Vols. 65 and 68, wherein such methods are reviewed.

The cDNA sequence encoding bovine preproparathyroid
hormone, shown in FIG. 2 and further
described in B. Kemper, et al., Hormonal Control
of Calcium Metabolism (Ed. by D. Cohn, et al.,
published Excerpta Medica at Amsterdam, Oxford,
and Princeton 1981) at pp. 19, was obtained from
Dr. Byron Kemper, Department of Physiology and
Biophysics and School of Basic Medical Sciences,
University of Illinois-Urbana. This cDNA sequence
was restricted with the enzymes Pvu II and Hinf I
at the sites shown in FIG. 2. These enzymes were
obtained from New England Biolaboratories, Beverly,
MA. The Hinf I site shown in FIG. 2 was filled with
nucleotides using the enzyme DNA polymerase I
(the large fragment) which was obtained from
New England Nuclear, Boston, MA. This modified
sequence was then blunt-end ligated to Hind III linkers
and restricted with the enzyme Hind III. The
Hind III linkers and Hind III enzyme were obtained
from New England Biolaboratories, supra. The
resulting DNA fragment was then ligated into the
Hind III site of the modified plasmid YEp-13,
forming a novel plasmid. This plasmid was designated
pYEM-1. FIG. 3 shows certain portions of the
nucleotide sequence of pYEM-1. The foregoing
construction of pYEM-1 was accomplished by methods
set forth generally in U.S. Patent No. 4,237,224,
supra, the BLR M13 handbook, and Methods of
Enzymology, Vols. 65 and 68, wherein such methods are
reviewed.
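
The restriction, fill-in and linker steps described above can be
sketched in Python on the top strand only. This is a minimal
illustration, not the actual construction: the toy sequence, the
function names and the Hind III linker sequence are hypothetical, and
the sketch assumes the Pvu II site lies upstream of the Hinf I site.
Only the recognition sequences (Pvu II CAG^CTG, Hinf I G^ANTC,
Hind III A^AGCTT) are standard.

```python
import re

PVU_II = "CAGCTG"        # Pvu II cuts CAG^CTG and leaves blunt ends
HINF_I = r"GA[ACGT]TC"   # Hinf I cuts G^ANTC and leaves a 3-nt 5' overhang
HIND_III = "AAGCTT"      # Hind III cuts A^AGCTT
LINKER = "CCAAGCTTGG"    # hypothetical Hind III linker, for illustration only


def excise_and_fill(top):
    """Top strand of the Pvu II-Hinf I fragment after fill-in.

    Assumes the Pvu II site is upstream of the Hinf I site.
    """
    start = top.index(PVU_II) + 3              # blunt cut after CAG
    site = re.search(HINF_I, top[start:])
    if site is None:
        raise ValueError("no Hinf I site downstream of the Pvu II site")
    # The top strand is cut after G; the polymerase copies the ANT overhang.
    end = start + site.start() + 4
    return top[start:end]


def add_linkers_and_cut(fragment):
    """Ligate a linker to each blunt end, then cut at the outermost Hind III sites."""
    ligated = LINKER + fragment + LINKER
    first = ligated.index(HIND_III) + 1        # cut after the first A
    last = ligated.rindex(HIND_III) + 1
    return ligated[first:last]


toy = "TTCAGCTGATGGCTAAGAGAGAATCAA"            # hypothetical sequence
filled = excise_and_fill(toy)
insert = add_linkers_and_cut(filled)
print(filled)
print(insert)
```

The sketch ignores the bottom strand and any Hind III site inside the
fragment itself; a real design would check for both.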

After constructing pYEM-1, yeast cells
were transformed with the plasmid using the methods
of Beggs, Nature (London), Vol. 275, pp. 104-109
(1978) and Hinnen, et al., Proc. Natl. Acad. of
Sci. USA, Vol. 75, pp. 1929-1933 (1978). Because
pYEM-1 has the yeast leu 2 gene, a leu 2 negative
strain of yeast was used in the transformation
for the purpose of selecting successful
transformants. Yeast strain X1069-2D,
a strain of Saccharomyces cerevisiae defective
in leu 2 function, was obtained from the Yeast
Genetic Stock Center, Univ. of California-Berkeley.

Of course any other defective yeast strain,
including strains within Saccharomyces pombe and
other species, could be used. All that is required
is that a complementation system be established
between the yeast strain and the cloning/expression
vector and that the vector be stably maintained
in yeast. For example, a strain defective in Trp 1
could be used if the Trp 1 gene were on the vector.
To date, several stable transformation systems have
been described (A. Hinnen and B. Meyhack,
Current Topics in Microbiology and Immunology,
Vol. 96, pp. 101-117 (1981); C. Hollenberg,
Current Topics in Microbiology and Immunology,
Vol. 96, pp. 119-144 (1981)).
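
The complementation requirement stated above can be sketched as a
simple set intersection. The marker labels ("leu2", "trp1") and the
function names are illustrative assumptions, not part of the
disclosed method: a marker is usable for selection only if the host
is defective in it and the vector supplies it.

```python
# Nutrient omitted from the growth medium to select for each marker.
# Keys are illustrative labels for the leu 2 and Trp 1 genes in the text.
OMITTED_NUTRIENT = {"leu2": "leucine", "trp1": "tryptophan"}


def selectable_markers(host_defects, vector_genes):
    """Markers defective in the host and supplied by the vector."""
    return sorted(set(host_defects) & set(vector_genes))


def selection_media(host_defects, vector_genes):
    """Nutrients to leave out of the medium so that only transformants grow."""
    markers = selectable_markers(host_defects, vector_genes)
    return [OMITTED_NUTRIENT[m] for m in markers if m in OMITTED_NUTRIENT]


# X1069-2D is defective in leu 2; pYEM-1 carries the leu 2 gene.
print(selection_media({"leu2"}, {"leu2"}))
# A Trp 1 defective host with the same vector gives no usable selection.
print(selection_media({"trp1"}, {"leu2"}))
```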

The transformed yeast cells containing
plasmid pYEM-1 were grown in a leucine-deficient
medium containing 5% glucose, yeast extract,
yeast nitrogen base and other nutrients suitable
for yeast strain X1069-2D. After 24 hours of
growth at 30 °C, the medium was collected and the
yeast cells lysed. Bioassay was performed
according to conventional techniques and PTH
radioimmunoassay was performed using Immuno
Nuclear Corporation (Stillwater, MN) assays
specific to the N-terminal, mid-molecule, and
C-terminal regions of parathyroid hormone. The
following table shows that both immunologically
cross-reactive parathyroid hormone and biologically
active parathyroid hormone are being produced in
yeast.

                            TABLE

               PTH           PTH            PTH
               N-terminal    Mid-molecule   C-terminal
               RIA*          region RIA*    RIA*         Bioassay*

Cell lysate
  pYEM-1       16            16             16           10
  control       0             0              0            0
Media
  pYEM-1        2             2              2            0.015
  control       0             0              0            0

*expressed in nanomoles/ml
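
One way to read the table is to compare the bioassay value with the
immunoreactive value for each sample. The ratio below is not a
quantity reported in the source; it is illustrative arithmetic on the
table values, and the function name is hypothetical.

```python
from statistics import mean

# Values copied from the table, all in nanomoles/ml.
TABLE = {
    ("cell lysate", "pYEM-1"): {"ria": (16, 16, 16), "bioassay": 10},
    ("cell lysate", "control"): {"ria": (0, 0, 0), "bioassay": 0},
    ("media", "pYEM-1"): {"ria": (2, 2, 2), "bioassay": 0.015},
    ("media", "control"): {"ria": (0, 0, 0), "bioassay": 0},
}


def bioactive_fraction(row):
    """Bioassay value divided by the mean RIA value, or None without RIA signal."""
    ria = mean(row["ria"])
    if ria == 0:
        return None
    return row["bioassay"] / ria


for (sample, plasmid), row in TABLE.items():
    print(sample, plasmid, bioactive_fraction(row))
```

Because the table gives concentrations and not volumes, this
comparison says nothing about the total amount of hormone in the
cells versus the medium.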

To confirm that correct processing had
occurred, 50 ml of culture was prepared in which
the parathyroid hormone producing yeast were
grown in media containing 35S methionine
(80 μCi/ml). After an overnight growth the
cells were removed by centrifugation. The media
was then incubated with specific N-terminal
parathyroid hormone antibody. After two hours
the antibody-antigen complex was recovered by
centrifugation and washed three times with new
media followed by an ether wash. This complex
contained about 7,000 cpm of 35S methionine
incorporated into protein after TCA precipitation.
This mixture was applied to a Beckman 890D
sequencer according to the methods of Mahoney
and Nute, Biochem., Vol. 19, pp. 4436 (1980), and
subsequently degraded 40 cycles. Sequence analysis
demonstrated that the 35S methionine was all
contained in cycles number 8 and 18. In mature
PTH, methionine appears only at positions
8 and 18 in the sequence. If preproparathyroid
hormone expressed by the yeast was left unprocessed,
we would expect 35S methionine in cycles 1, 2,
7, 11, 14, 49, and 59 reflecting the appearance
of methionine at positions -31, -30, -25, -21,
-18, +8, +18 in the preproparathyroid sequence.
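
The reasoning in this paragraph amounts to comparing the set of
labeled cycles with the cycles expected for each form of the protein.
A minimal sketch follows; the function name and the returned wording
are hypothetical, and the cycle numbers are copied from the text as
stated.

```python
# Cycle numbers copied from the text.
MATURE_PTH_MET = {8, 18}
UNPROCESSED_MET = {1, 2, 7, 11, 14, 49, 59}


def classify(labeled_cycles, cycles_run=40):
    """Compare labeled cycles with the methionine cycles expected for each form.

    Only cycles up to cycles_run can be observed on the sequencer.
    """
    observed = set(labeled_cycles)
    mature = {c for c in MATURE_PTH_MET if c <= cycles_run}
    unprocessed = {c for c in UNPROCESSED_MET if c <= cycles_run}
    if observed == mature:
        return "consistent with mature PTH"
    if observed == unprocessed:
        return "consistent with unprocessed precursor"
    return "inconclusive"


print(classify({8, 18}))
```

With the reported result, label in cycles 8 and 18 only over 40
cycles, the sketch returns the mature-PTH outcome.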

The novel microorganism yeast strain
X1069-2D transformed by novel plasmid pYEM-1,
designated X1069-2D-pYEM-1, was placed on permanent
deposit in the Northern Regional Research Center,
U.S. Dept. of Agriculture, Peoria, Illinois
61604 on September 8, 1982. The NRRL number for
X1069-2D-pYEM-1 is Y-15153. The plasmid pYEM-1
and the transfer vector contained therein may be
removed from this novel yeast strain by known
means.

While the invention has been described in
connection with a specific embodiment thereof, it
will be understood that it is capable of further
modifications and this application is intended to
cover any variations, uses, or adaptations of the
invention within the scope of the appended claims.

CLAIMS

1. A method for producing protein in yeast
transformed to express a corresponding precursor
containing a pair or triplet of basic amino acid
residues located proximally and/or distally adjacent
to the protein portion of the precursor sequence,
comprising proteolytic processing by the yeast
of the precursor at the site of such pairs or
triplets of basic amino acid residues.

2. The method of claim 1 wherein the proteolytic
processing by yeast of the precursor at the site
of such pairs or triplets of basic amino acid
residues comprises cleaving, by a trypsin-like
enzyme or enzymes present in the transformed yeast,
at the distal side of such pairs or triplets of
basic amino acid residues.

3. The method of claim 2 wherein the proteolytic
processing by yeast of the precursor at the site
of such pairs or triplets of basic amino acid
residues further comprises degrading, by a
carboxypeptidase-B-like enzyme or enzymes present in the
transformed yeast, of any such pairs or triplets
of basic amino acid residues remaining distally
adjacent to the protein portion of the precursor
sequence after the cleavage by the trypsin-like
enzyme or enzymes.

4. The method of claim 1 wherein the corresponding 
precursor is a proto-protein. 

5. The method of claim 4 wherein the
proto-protein is source synthetic proto-protein.

6. The method of claim 4 wherein the
proto-protein is source natural proto-protein.

7. The method of claim 6 wherein the source
natural proto-protein is bovine preproparathyroid
hormone.

8. The method of claim 1 wherein the protein
is mammalian insulin and the corresponding precursor
is mammalian preproinsulin or proinsulin.

9. The method of claim 8 wherein the
mammalian insulin and mammalian preproinsulin
or mammalian proinsulin are members respectively
of the group consisting of human insulin and
human preproinsulin or human proinsulin,
bovine insulin and bovine preproinsulin or
bovine proinsulin, and porcine insulin and porcine
preproinsulin or porcine proinsulin.

10. The method of claim 1 wherein the protein 
is animal calcitonin or an animal calcitonin 
relative and the precursor is animal preprocalcitonin 
or animal procalcitonin. 

11. The method of claim 10 wherein the animal
calcitonin or animal calcitonin relative and the
animal preprocalcitonin or animal procalcitonin
are members respectively of the group consisting
of human calcitonin or human calcitonin relative
and human preprocalcitonin or human procalcitonin,
bovine calcitonin or bovine calcitonin relative
and bovine preprocalcitonin or bovine procalcitonin,
and porcine calcitonin or porcine calcitonin
relative and porcine preprocalcitonin or
procalcitonin.

12. The method of claim 1 wherein the yeast
is Saccharomyces cerevisiae or Saccharomyces
pombe.

13. The method of claim 12 wherein the yeast
is Saccharomyces cerevisiae.

14. A recombinant DNA plasmid transfer vector
useful for transforming yeast comprising a
DNA sequence comprising the preproparathyroid
gene cDNA sequence.

15. The plasmid pYEM-1.

16. Yeast transformed by a plasmid comprising
the transfer vector of claim 14.

17. Yeast transformed by the plasmid of
claim 15.
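
The two-step processing recited in claims 1 to 3 can be illustrated
with a short Python sketch. It is a simplified model under stated
assumptions, not a description of the yeast enzymes: "distal" is
taken to mean the C-terminal side, the basic residues are lysine (K)
and arginine (R), the precursor sequence is a hypothetical toy and
not the preproparathyroid sequence, and the function names are
invented for illustration.

```python
import re


def trypsin_like_cleave(precursor):
    """Cut on the C-terminal side of each pair or triplet of K/R residues."""
    cuts = [m.end() for m in re.finditer(r"[KR]{2,3}(?=[^KR])", precursor)]
    pieces, start = [], 0
    for cut in cuts:
        pieces.append(precursor[start:cut])
        start = cut
    pieces.append(precursor[start:])
    return pieces


def carboxypeptidase_b_like_trim(piece):
    """Remove basic residues left at the C-terminus after cleavage.

    Simplification: this also removes a single genuine C-terminal K or R.
    """
    return piece.rstrip("KR")


def process(precursor):
    """Apply cleavage, then trimming, to a precursor sequence."""
    return [carboxypeptidase_b_like_trim(p) for p in trypsin_like_cleave(precursor)]


# Hypothetical precursor: leader + KKR + protein + KR + tail.
toy_precursor = "MALSVKKRAVSEIQFMHNLGKRSPLE"
print(trypsin_like_cleave(toy_precursor))
print(process(toy_precursor))
```

In this toy case the middle piece corresponds to the protein portion
flanked proximally by a triplet and distally by a pair of basic
residues.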